Parmesan: Meteor without Paraphrases with Paraphrased References
Abstract
This paper describes Parmesan, our submission to the metrics task of the 2014 Workshop on Statistical Machine Translation (WMT) for evaluating English-to-Czech translation. We show that the Czech Meteor paraphrase tables are so noisy that they can actually harm the performance of the metric. However, after extensive filtering, they can be very useful for targeted paraphrasing of Czech reference sentences prior to evaluation. Parmesan first performs targeted paraphrasing of the reference sentences and then computes the Meteor score using only exact matching against these new references. It shows significantly higher correlation with human judgment than Meteor on the WMT12 and WMT13 data.
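The two-step idea in the abstract (paraphrase the reference toward the hypothesis, then score with exact matches only) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names `targeted_paraphrase` and `exact_match_fscore` are hypothetical, the paraphrase table is a toy word-level dictionary in English rather than the filtered Czech tables, and the score is only a recall-weighted harmonic mean of unigram precision and recall, without Meteor's fragmentation penalty or word weighting.

```python
from collections import Counter


def targeted_paraphrase(reference, hypothesis, paraphrases):
    """Rewrite reference words toward the hypothesis (toy, word-level only).

    A reference word is replaced only when it is absent from the hypothesis
    and one of its listed paraphrases is present there.
    """
    hyp_words = set(hypothesis)
    new_reference = []
    for word in reference:
        if word not in hyp_words:
            candidates = [p for p in paraphrases.get(word, ()) if p in hyp_words]
            if candidates:
                word = candidates[0]  # take the first paraphrase found in the hypothesis
        new_reference.append(word)
    return new_reference


def exact_match_fscore(hypothesis, reference, alpha=0.85):
    """Recall-weighted harmonic mean of exact unigram precision and recall.

    alpha is an illustrative value, not a tuned Meteor parameter.
    """
    matches = sum((Counter(hypothesis) & Counter(reference)).values())
    if matches == 0:
        return 0.0
    precision = matches / len(hypothesis)
    recall = matches / len(reference)
    return precision * recall / (alpha * precision + (1 - alpha) * recall)


# Hypothetical toy data (not taken from the paper).
hypothesis = ["the", "car", "is", "fast"]
reference = ["the", "automobile", "is", "quick"]
paraphrases = {"automobile": ["car"], "quick": ["rapid", "fast"]}

new_reference = targeted_paraphrase(reference, hypothesis, paraphrases)
score_before = exact_match_fscore(hypothesis, reference)
score_after = exact_match_fscore(hypothesis, new_reference)
```

On this toy input the paraphrased reference becomes identical to the hypothesis, so the exact-match score rises from about 0.5 to 1.0. The sketch also shows why a noisy table is a risk: any wrong entry that happens to occur in the hypothesis would be rewarded, which is the motivation the abstract gives for filtering the tables first.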
Similar resources
Machine Translation within One Language as a Paraphrasing Technique Online
We present a method for improving machine translation (MT) evaluation by targeted paraphrasing of reference sentences. For this purpose, we employ MT systems themselves and adapt them for translating within a single language. We describe this attempt on two types of MT systems – phrase-based and rule-based. Initially, we experiment with the freely available SMT system Moses. We create translati...
HIT2016@DPIL-FIRE2016: Detecting Paraphrases in Indian Languages based on Gradient Tree Boosting
Detecting paraphrase is an important and challenging task. It can be used in paraphrases generation and extraction, machine translation, question and answer and plagiarism detection. Since the same meaning of a sentence is expressed in another sentence using different words, it makes the traditional methods based on lexical similarity ineffective. In this paper, we describe a strategy of Detect...
Multilingual WSD-like Constraints for Paraphrase Extraction
The use of pivot languages and word-alignment techniques over bilingual corpora has proved an effective approach for extracting paraphrases of words and short phrases. However, inherent ambiguities in the pivot language(s) can lead to inadequate paraphrases. We propose a novel approach that is able to extract paraphrases by pivoting through multiple languages while discriminating word senses in ...
FUN-NRC: Paraphrase-augmented Phrase-based SMT Systems for NTCIR-10 PatentMT
This paper describes FUN-NRC group’s machine translation systems that participated in the NTCIR-10 PatentMT task. The central motivation of this participation was to clarify the potential of automatically compiled collections of sub-sentential paraphrases. Our systems were built using our baseline phrase-based SMT system by augmenting its phrase table with novel translation pairs generated by c...
Meteor Universal: Language Specific Translation Evaluation for Any Target Language
Parameter set learned using all WMT12 data (Callison-Burch et al., 2012):
• 100,000 binary rankings covering 8 language directions.
• Restrict scoring for all languages to exact and paraphrase matching.
Parameters encode human preferences that generalize across languages:
• Prefer recall over precision.
• Prefer word choice over word order.
• Prefer correct translations of content words over functi...
Publication date: 2014